<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE topic
  PUBLIC "-//OASIS//DTD DITA Composite//EN" "ditabase.dtd">
<topic id="topic1">
        <title id="bu20941">CREATE INDEX</title>
        <body>
                <p id="sql_command_desc">Defines a new index.</p>
                <section id="section2">
                        <title>Synopsis</title>
                        <codeblock id="sql_command_synopsis">CREATE [UNIQUE] INDEX [<varname>name</varname>] ON <varname>table_name</varname> [USING <varname>method</varname>]
       ( {<varname>column_name</varname> | (<varname>expression</varname>)} [COLLATE <varname>parameter</varname>] [<varname>opclass</varname>] [ ASC | DESC ] [ NULLS { FIRST | LAST } ] [, ...] )
       [ WITH ( <varname>storage_parameter</varname> = <varname>value</varname> [, ... ] ) ]
       [ TABLESPACE <varname>tablespace</varname> ]
       [ WHERE <varname>predicate</varname> ]</codeblock>
                </section>
                <section id="section3">
                        <title>Description</title>
                        <p><codeph>CREATE INDEX</codeph> constructs an index on the specified
                                column(s) of the specified table or materialized view. Indexes are
                                primarily used to enhance database performance (though inappropriate
                                use can result in slower performance). </p>
                        <p>The key field(s) for the index are specified as column names, or
                                alternatively as expressions written in parentheses. Multiple fields
                                can be specified if the index method supports multicolumn
                                indexes.</p>
                        <p>An index field can be an expression computed from the values of one or
                                more columns of the table row. This feature can be used to obtain
                                fast access to data based on some transformation of the basic data.
                                For example, an index computed on <codeph>upper(col)</codeph> would
                                allow the clause <codeph>WHERE upper(col) = 'JIM'</codeph> to use an
                                index. </p>
                        <p>Greenplum Database provides the index methods B-tree, bitmap, GiST,
                                SP-GiST, and GIN. Users can also define their own index methods, but
                                that is fairly complicated. </p>
                        <p>When the <codeph>WHERE</codeph> clause is present, a partial index is
                                created. A partial index is an index that contains entries for only
                                a portion of a table, usually a portion that is more useful for
                                indexing than the rest of the table. For example, if you have a
                                table that contains both billed and unbilled orders where the
                                unbilled orders take up a small fraction of the total table and yet
                                is most often selected, you can improve performance by creating an
                                index on just that portion.</p>
                        <p>The expression used in the <codeph>WHERE</codeph> clause may refer only
                                to columns of the underlying table, but it can use all columns, not
                                just the ones being indexed. Subqueries and aggregate expressions
                                are also forbidden in <codeph>WHERE</codeph>. The same restrictions
                                apply to index fields that are expressions.</p>
                        <p>All functions and operators used in an index definition must be
                                immutable. Their results must depend only on their arguments and
                                never on any outside influence (such as the contents of another
                                table or a parameter value). This restriction ensures that the
                                behavior of the index is well-defined. To use a user-defined
                                function in an index expression or <codeph>WHERE</codeph> clause,
                                remember to mark the function <codeph>IMMUTABLE</codeph> when you
                                create it.</p>
                </section>
                <section id="section4">
                        <title>Parameters</title>
                        <parml>
                                <plentry>
                                        <pt>UNIQUE</pt>
                                        <pd>Checks for duplicate values in the table when the index
                                                is created and each time data is added. Duplicate
                                                entries will generate an error. Unique indexes only
                                                apply to B-tree indexes. In Greenplum Database,
                                                unique indexes are allowed only if the columns of
                                                the index key are the same as (or a superset of) the
                                                Greenplum distribution key. On partitioned tables, a
                                                unique index is only supported within an individual
                                                partition - not across all partitions.</pd>
                                </plentry>
                                <plentry>
                                        <pt>
                                                <varname>name</varname>
                                        </pt>
                                        <pd>The name of the index to be created. The index is always
                                                created in the same schema as its parent table. If
                                                the name is omitted, Greenplum Database chooses a
                                                suitable name based on the parent table's name and
                                                the indexed column name(s). </pd>
                                </plentry>
                                <plentry>
                                        <pt>
                                                <varname>table_name</varname>
                                        </pt>
                                        <pd>The name (optionally schema-qualified) of the table to
                                                be indexed. </pd>
                                </plentry>
                                <plentry>
                                        <pt>
                                                <varname>method</varname>
                                        </pt>
                                        <pd>The name of the index method to be used. Choices are
                                                  <codeph>btree</codeph>, <codeph>bitmap</codeph>,
                                                  <codeph>gist</codeph>, <codeph>spgist</codeph>,
                                                and <codeph>gin</codeph>. The default method is
                                                  <codeph>btree</codeph>. </pd>
                                        <pd>Currently, only the B-tree, GiST, and GIN index methods
                                                support multicolumn indexes. Up to 32 fields can be
                                                specified by default. Only B-tree currently supports
                                                unique indexes.</pd>
                                        <pd>GPORCA supports only B-tree, bitmap, GiST, and GIN indexes.
                                                GPORCA ignores indexes created with unsupported
                                                indexing methods.</pd>
                                </plentry>
                                <plentry>
                                        <pt>
                                                <varname>column_name</varname>
                                        </pt>
                                        <pd>The name of a column of the table on which to create the
                                                index. Only the B-tree, bitmap, GiST, and GIN index
                                                methods support multicolumn indexes.</pd>
                                </plentry>
                                <plentry>
                                        <pt>
                                                <varname>expression</varname>
                                        </pt>
                                        <pd>An expression based on one or more columns of the table.
                                                The expression usually must be written with
                                                surrounding parentheses, as shown in the syntax.
                                                However, the parentheses may be omitted if the
                                                expression has the form of a function call.</pd>
                                </plentry>
                                <plentry>
                                        <pt><varname>collation</varname></pt>
                                        <pd> The name of the collation to use for the index. By
                                                default, the index uses the collation declared for
                                                the column to be indexed or the result collation of
                                                the expression to be indexed. Indexes with
                                                non-default collations can be useful for queries
                                                that involve expressions using non-default
                                                collations. </pd>
                                </plentry>
                                <plentry>
                                        <pt>
                                                <varname>opclass</varname>
                                        </pt>
                                        <pd>The name of an operator class. The operator class
                                                identifies the operators to be used by the index for
                                                that column. For example, a B-tree index on
                                                four-byte integers would use the
                                                  <codeph>int4_ops</codeph> class (this operator
                                                class includes comparison functions for four-byte
                                                integers). In practice the default operator class
                                                for the column's data type is usually sufficient.
                                                The main point of having operator classes is that
                                                for some data types, there could be more than one
                                                meaningful ordering. For example, a complex-number
                                                data type could be sorted by either absolute value
                                                or by real part. We could do this by defining two
                                                operator classes for the data type and then
                                                selecting the proper class when making an
                                                index.</pd>
                                </plentry>
                                <plentry>
                                        <pt>ASC</pt>
                                        <pd>Specifies ascending sort order (which is the
                                                default).</pd>
                                </plentry>
                                <plentry>
                                        <pt>DESC</pt>
                                        <pd>Specifies descending sort order.</pd>
                                </plentry>
                                <plentry>
                                        <pt>NULLS FIRST</pt>
                                        <pd>Specifies that nulls sort before non-nulls. This is the
                                                default when <codeph>DESC</codeph> is
                                                specified.</pd>
                                </plentry>
                                <plentry>
                                        <pt>NULLS LAST</pt>
                                        <pd>Specifies that nulls sort after non-nulls. This is the
                                                default when <codeph>DESC</codeph> is not
                                                specified.</pd>
                                </plentry>
                                <plentry>
                                        <pt><varname>storage_parameter</varname></pt>
                                        <pd>The name of an index-method-specific storage parameter.
                                                Each index method has its own set of allowed storage
                                                parameters.</pd>
                                        <pd><codeph>FILLFACTOR</codeph> - B-tree, bitmap, GiST, and
                                                SP-GiST index methods all accept this parameter. The
                                                  <codeph>FILLFACTOR</codeph> for an index is a
                                                percentage that determines how full the index method
                                                will try to pack index pages. For B-trees, leaf
                                                pages are filled to this percentage during initial
                                                index build, and also when extending the index at
                                                the right (adding new largest key values). If pages
                                                subsequently become completely full, they will be
                                                split, leading to gradual degradation in the index's
                                                efficiency. B-trees use a default fillfactor of 90,
                                                but any integer value from 10 to 100 can be
                                                selected. If the table is static then fillfactor 100
                                                is best to minimize the index's physical size, but
                                                for heavily updated tables a smaller fillfactor is
                                                better to minimize the need for page splits. The
                                                other index methods use fillfactor in different but
                                                roughly analogous ways; the default fillfactor
                                                varies between methods.</pd>
                                        <pd><codeph>BUFFERING</codeph> - In addition to
                                                  <codeph>FILLFACTOR</codeph>, GiST indexes
                                                additionally accept the <codeph>BUFFERING</codeph>
                                                parameter. <codeph>BUFFERING</codeph> determines
                                                whether Greenplum Database builds the index using
                                                the buffering build technique described in <xref
                                                  href="https://www.postgresql.org/docs/9.4/gist-implementation.html"
                                                  format="html" scope="external">GiST buffering
                                                  build</xref> in the PostgreSQL documentation. With
                                                  <codeph>OFF</codeph> it is disabled, with
                                                  <codeph>ON</codeph> it is enabled, and with
                                                  <codeph>AUTO</codeph> it is initially disabled,
                                                but turned on on-the-fly once the index size reaches
                                                  <xref
                                                  href="../config_params/guc-list.xml#effective_cache_size"
                                                  >effective-cache-size</xref>. The default is
                                                  <codeph>AUTO</codeph>.</pd>
                                        <pd><codeph>FASTUPDATE</codeph> - The GIN index method
                                                accepts the <codeph>FASTUPDATE</codeph> storage
                                                parameter. <codeph>FASTUPDATE</codeph> is a Boolean
                                                parameter that disables or enables the GIN index
                                                fast update technique. A value of ON enables fast
                                                update (the default), and OFF disables it. See <xref
                                                  href="https://www.postgresql.org/docs/9.4/gin-implementation.html#GIN-FAST-UPDATE"
                                                  format="html" scope="external">GIN fast update
                                                  technique</xref> in the PostgreSQL documentation
                                                for more information.<note>Turning
                                                  <codeph>FASTUPDATE</codeph> off via <codeph>ALTER
                                                  INDEX</codeph> prevents future insertions from
                                                  going into the list of pending index entries, but
                                                  does not in itself flush previous entries. You
                                                  might want to VACUUM the table afterward to ensure
                                                  the pending list is emptied.</note></pd>
                                </plentry>
                                <plentry>
                                        <pt>
                                                <varname>tablespace_name</varname>
                                        </pt>
                                        <pd>The tablespace in which to create the index. If not
                                                specified, the default tablespace is used, or <xref
                                                  href="../config_params/guc-list.xml#topic_k52_fqm_f3b"
                                                /> for indexes on temporary tables. </pd>
                                </plentry>
                                <plentry>
                                        <pt>
                                                <varname>predicate</varname>
                                        </pt>
                                        <pd>The constraint expression for a partial index.</pd>
                                </plentry>
                        </parml>
                </section>
                <section id="section5">
                        <title>Notes</title>
                        <p>An <i>operator class</i> can be specified for each column of an index.
                                The operator class identifies the operators to be used by the index
                                for that column. For example, a B-tree index on four-byte integers
                                would use the int4_ops class; this operator class includes
                                comparison functions for four-byte integers. In practice the default
                                operator class for the column's data type is usually sufficient. The
                                main point of having operator classes is that for some data types,
                                there could be more than one meaningful ordering. For example, we
                                might want to sort a complex-number data type either by absolute
                                value or by real part. We could do this by defining two operator
                                classes for the data type and then selecting the proper class when
                                making an index. </p>
                        <p>For index methods that support ordered scans (currently, only B-tree),
                                the optional clauses <codeph>ASC</codeph>, <codeph>DESC</codeph>,
                                        <codeph>NULLS FIRST</codeph>, and/or <codeph>NULLS
                                        LAST</codeph> can be specified to modify the sort ordering
                                of the index. Since an ordered index can be scanned either forward
                                or backward, it is not normally useful to create a single-column
                                        <codeph>DESC</codeph> index — that sort ordering is already
                                available with a regular index. The value of these options is that
                                multicolumn indexes can be created that match the sort ordering
                                requested by a mixed-ordering query, such as <codeph>SELECT ...
                                        ORDER BY x ASC, y DESC</codeph>. The <codeph>NULLS</codeph>
                                options are useful if you need to support "nulls sort low" behavior,
                                rather than the default "nulls sort high", in queries that depend on
                                indexes to avoid sorting steps.</p>
                        <p>For most index methods, the speed of creating an index is dependent on
                                the setting of <codeph>maintenance_work_mem</codeph>. Larger values
                                will reduce the time needed for index creation, so long as you don't
                                make it larger than the amount of memory really available, which
                                would drive the machine into swapping.</p>
                        <p>When an index is created on a partitioned table, the index is propagated
                                to all the child tables created by Greenplum Database. Creating an
                                index on a table that is created by Greenplum Database for use by a
                                partitioned table is not supported.</p>
                        <p><codeph>UNIQUE</codeph> indexes are allowed only if the index columns are
                                the same as (or a superset of) the Greenplum distribution key
                                columns.</p>
                        <p><codeph>UNIQUE</codeph> indexes are not allowed on append-optimized
                                tables.</p>
                        <p>A <codeph>UNIQUE</codeph> index can be created on a partitioned table.
                                However, uniqueness is enforced only within a partition; uniqueness
                                is not enforced between partitions. For example, for a partitioned
                                table with partitions that are based on year and a subpartitions
                                that are based on quarter, uniqueness is enforced only on each
                                individual quarter partition. Uniqueness is not enforced between
                                quarter partitions</p>
                        <p>Indexes are not used for <codeph>IS NULL</codeph> clauses by default. The
                                best way to use indexes in such cases is to create a partial index
                                using an <codeph>IS NULL</codeph> predicate. </p>
                        <p><codeph>bitmap</codeph> indexes perform best for columns that have
                                between 100 and 100,000 distinct values. For a column with more than
                                100,000 distinct values, the performance and space efficiency of a
                                bitmap index decline. The size of a bitmap index is proportional to
                                the number of rows in the table times the number of distinct values
                                in the indexed column. </p>
                        <p>Columns with fewer than 100 distinct values usually do not benefit much
                                from any type of index. For example, a gender column with only two
                                distinct values for male and female would not be a good candidate
                                for an index. </p>
                        <p>Prior releases of Greenplum Database also had an R-tree index method.
                                This method has been removed because it had no significant
                                advantages over the GiST method. If <codeph>USING rtree</codeph> is
                                specified, <codeph>CREATE INDEX</codeph> will interpret it as
                                        <codeph>USING gist</codeph>. </p>
                        <p>For more information on the GiST index type, refer to the <xref
                                        href="https://www.postgresql.org/docs/9.4/indexes-types.html"
                                        scope="external" format="html">PostgreSQL
                                        documentation</xref>.</p>
                        <p>The use of hash indexes has been disabled in Greenplum Database.</p>
                </section>
                <section id="section6">
                        <title>Examples</title>
                        <p>To create a B-tree index on the column <codeph>title</codeph> in the
                                table <codeph>films</codeph>:</p>
                        <codeblock>CREATE UNIQUE INDEX title_idx ON films (title);</codeblock>
                        <p>To create a bitmap index on the column <codeph>gender</codeph> in the
                                table <codeph>employee</codeph>:</p>
                        <codeblock>CREATE INDEX gender_bmp_idx ON employee USING bitmap 
(gender);</codeblock>
                        <p>To create an index on the expression <codeph>lower(title)</codeph>,
                                allowing efficient case-insensitive searches:</p>
                        <codeblock>CREATE INDEX ON films ((lower(title)));</codeblock>
                        <p>(In this example we have chosen to omit the index name, so the system
                                will choose a name, typically <codeph>films_lower_idx</codeph>.)
                        </p>
                        <p> To create an index with non-default collation: </p>
                        <codeblock>CREATE INDEX title_idx_german ON films (title COLLATE "de_DE");</codeblock>
                        <p>To create an index with non-default fill factor:</p>
                        <codeblock>CREATE UNIQUE INDEX title_idx ON films (title) WITH 
(fillfactor = 70);</codeblock>
                        <p>To create a GIN index with fast updates disabled:</p>
                        <codeblock>CREATE INDEX gin_idx ON documents_table USING gin (locations) WITH (fastupdate = off);</codeblock>
                        <p>To create an index on the column <codeph>code</codeph> in the table
                                        <codeph>films</codeph> and have the index reside in the
                                tablespace <codeph>indexspace</codeph>:</p>
                        <codeblock>CREATE INDEX code_idx ON films(code) TABLESPACE indexspace;</codeblock>
                        <p>To create a GiST index on a point attribute so that we
                                can efficiently use box operators on the result of the
                                conversion function:</p>
                        <codeblock>CREATE INDEX pointloc ON points USING gist (box(location,location));
SELECT * FROM points WHERE box(location,location) &amp;&amp; '(0,0),(1,1)'::box;</codeblock>
                </section>
                <section id="section7">
                        <title>Compatibility</title>
                        <p><codeph>CREATE INDEX</codeph> is a Greenplum Database language extension.
                                There are no provisions for indexes in the SQL standard. </p>
                        <p>Greenplum Database does not support the concurrent creation of indexes
                                        (<codeph>CONCURRENTLY</codeph> keyword not supported).</p>
                </section>
                <section id="section8">
                        <title>See Also</title>
                        <p><codeph><xref href="ALTER_INDEX.xml#topic1" type="topic" format="dita"
                                        /></codeph>, <codeph><xref href="./DROP_INDEX.xml#topic1"
                                                type="topic" format="dita"/></codeph>, <codeph><xref
                                                href="./CREATE_TABLE.xml#topic1" type="topic"
                                                format="dita"/></codeph>, <codeph><xref
                                                href="./CREATE_OPERATOR_CLASS.xml#topic1"
                                                type="topic" format="dita"/></codeph></p>
                </section>
        </body>
</topic>
